Predicting protein structure classes from function predictions
Abstract
MOTIVATION: We introduce a new approach to using the information contained in sequence-to-function prediction data in order to recognize protein template classes, a critical step in predicting protein structure. The data on which our method is based comprise probabilities of functional categories; for given query sequences these probabilities are obtained by a neural net that has previously been trained on a variety of functionally important features. On a training set of sequences we assess the relevance of individual functional categories for identifying a given structural family. Using a combination of the most relevant categories, the likelihood of a query sequence to belong to a specific family can be estimated.
RESULTS: The performance of the method is evaluated using cross-validation. For a fixed structural family and for every sequence, a score is calculated that measures the evidence for family membership. Even for structural families of small size, family members receive significantly higher scores. For some examples, we show that the relevant functional features identified by this method are biologically meaningful. The proposed approach can be used to improve existing sequence-to-structure prediction methods.
AVAILABILITY: Matlab code is available on request from the authors. The data are available at http://www.mpisb.mpg.de/~sommer/Fun2Struc/
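The workflow outlined in the abstract (rank functional categories by relevance for one structural family, combine the most relevant ones into a membership score, and evaluate by cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' Matlab implementation. The relevance measure (difference in mean predicted probability between family members and non-members), the signed-sum score, the leave-one-out scheme, the toy data, and all function names are assumptions chosen only for illustration.

```python
from statistics import mean


def category_relevance(probs, is_member):
    """Per category: mean probability among members minus mean among non-members.

    Assumed relevance measure; the abstract does not specify the one used.
    """
    members = [row for row, m in zip(probs, is_member) if m]
    others = [row for row, m in zip(probs, is_member) if not m]
    n_categories = len(probs[0])
    return [
        mean(row[j] for row in members) - mean(row[j] for row in others)
        for j in range(n_categories)
    ]


def select_categories(relevance, k):
    """Indices of the k categories with the largest absolute relevance."""
    ranked = sorted(range(len(relevance)), key=lambda j: abs(relevance[j]), reverse=True)
    return ranked[:k]


def family_score(query, relevance, selected):
    """Signed sum: categories enriched in the family add, depleted ones subtract."""
    return sum(query[j] if relevance[j] > 0 else -query[j] for j in selected)


def leave_one_out_scores(probs, is_member, k):
    """Score each sequence with relevance estimated from all other sequences."""
    scores = []
    for i, query in enumerate(probs):
        train_probs = probs[:i] + probs[i + 1:]
        train_labels = is_member[:i] + is_member[i + 1:]
        relevance = category_relevance(train_probs, train_labels)
        selected = select_categories(relevance, k)
        scores.append(family_score(query, relevance, selected))
    return scores


# Hypothetical toy data: rows are sequences, columns are functional-category
# probabilities (as a function predictor might output them).
PROBS = [
    [0.90, 0.20, 0.50],
    [0.80, 0.10, 0.40],
    [0.85, 0.30, 0.60],
    [0.20, 0.70, 0.50],
    [0.10, 0.80, 0.40],
    [0.30, 0.90, 0.60],
]
IS_MEMBER = [True, True, True, False, False, False]

LOO_SCORES = leave_one_out_scores(PROBS, IS_MEMBER, k=2)
```

In this toy setting the first two categories separate the family from the rest, so held-out family members score higher than held-out non-members. Real data would call for a statistically grounded relevance measure and a significance test on the score distributions.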
Related articles
Hierarchical Multilabel Protein Function Prediction Using Local Neural Networks
Protein function predictions are usually treated as classification problems where each function is regarded as a class label. However, different from conventional classification problems, they have some specificities that make the classification task more complex. First, the problem classes (protein functions) are usually hierarchically structured, with superclasses and subclasses. Second, prot...
FLORA: A Novel Method to Predict Protein Function from Structure in Diverse Superfamilies
Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particul...
Improving the Prediction of Protein Secondary Structure in Three and Eight Classes Using Recurrent Neural Networks and Profiles
Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI-BLAST-derived profiles, and a large non-redundant training set to derive two new predictors: (1) the second version of the SSpro program for secondary structure classificat...
Predicting enzyme class from protein structure without alignments.
Methods for predicting protein function from structure are becoming more important as the rate at which structures are solved increases more rapidly than experimental knowledge. As a result, protein structures now frequently lack functional annotations. The majority of methods for predicting protein function are reliant upon identifying a similar protein and transferring its annotations to the ...
Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles.
Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI-BLAST-derived profiles, and a large nonredundant training set to derive two new predictors: (a) the second version of the SSpro program for secondary structure classific...
Journal title:
- Bioinformatics
Volume 20, Issue 5
Pages -
Publication date 2004